KMID : 1022420150070010003
Phonetics and Speech Sciences
2015 Volume.7 No. 1 p.3 ~ p.10
Evaluation of Frequency Warping Based Features and Spectro-Temporal Features for Speaker Recognition
Choi Young-Ho

Ban Sung-Min
Kim Kyung-Wha
Kim Hyung-Soon
Abstract
In this paper, different frequency scales in cepstral feature extraction are evaluated for text-independent speaker recognition. To this end, mel-frequency cepstral coefficients (MFCCs), linear frequency cepstral coefficients (LFCCs), and bilinear warped frequency cepstral coefficients (BWFCCs) are applied to speaker recognition experiments. In addition, spectro-temporal features extracted with the cepstral-time matrix (CTM) are examined as an alternative to the delta and delta-delta features. Experiments on the NIST speaker recognition evaluation (SRE) 2004 task are carried out using the Gaussian mixture model-universal background model (GMM-UBM) method and the joint factor analysis (JFA) method, both based on the ALIZE 3.0 toolkit. Results with both methods show that BWFCC with an appropriate warping factor yields better performance than MFCC and LFCC. They also show that the feature set including CTM-based spectro-temporal information outperforms the conventional feature set including the delta and delta-delta features.
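The abstract does not give the formulas behind BWFCC or the CTM, so the following is only a minimal sketch under stated assumptions. It uses the standard first-order all-pass (bilinear) frequency mapping and a DCT-II taken along the time axis of a block of cepstral vectors. The function names, the warping factor 0.3, and the block length are hypothetical choices for illustration, not values taken from the paper.

```python
import math


def bilinear_warp(omega, alpha):
    """Map a normalized angular frequency omega (0..pi) onto the warped axis.

    Uses the phase response of a first-order all-pass filter.
    alpha = 0 leaves the axis linear; alpha > 0 stretches low frequencies.
    """
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def cepstral_time_matrix(frames, num_time_coeffs):
    """Apply a DCT-II along time to a block of cepstral vectors.

    frames: list of M cepstral vectors of equal length (assumed layout).
    Returns num_time_coeffs rows; row n holds the n-th temporal DCT
    coefficient of every cepstral dimension. Row 0 is proportional to the
    block average; higher rows describe how the cepstrum changes in time.
    """
    block_len = len(frames)
    dim = len(frames[0])
    ctm = []
    for n in range(num_time_coeffs):
        basis = [math.cos((2 * k + 1) * n * math.pi / (2 * block_len))
                 for k in range(block_len)]
        ctm.append([sum(basis[k] * frames[k][d] for k in range(block_len))
                    for d in range(dim)])
    return ctm


# Hypothetical usage: warp one frequency, then transform a 4-frame block.
warped = bilinear_warp(math.pi / 4, 0.3)
block = [[1.0, 2.0], [1.5, 1.0], [2.0, 0.5], [2.5, 0.0]]
ctm = cepstral_time_matrix(block, 3)
```

In this sketch the higher-order rows of the CTM play a role loosely comparable to the delta and delta-delta features, since they vanish for a block whose cepstrum does not change over time.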
KEYWORD
speaker recognition, GMM-UBM, JFA, MFCC, LFCC, BWFCC, delta feature, cepstral-time matrix
Listed journal information
Korea Research Foundation (KCI)